{
 "metadata": {
  "name": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Translating Python functions to Javascript at runtime"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We describe a simple virtual machine and intermediate language used to represent Python functions in a language-independent way. This allows us to translate these functions in other dynamical languages such as Javascript.\n",
      "\n",
      "This method is not generic and is severely limited in the functions that can be handled. It can only work on functions that are *effectively* simple in the operations they perform on their inputs. In particular, conditional branching is not supported within the functions (but it *is* supported within functions of the instruction set). Also, no dynamic feature (with respect to the input variables) is supported."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Finally, this method requires to reimplement a set of Python functions in Javascript. Our main argument is that this set is small and only contains simple functions (Python & NumPy), so that this method is practically useful for the applications we target. Redefining a small set of basic Python functions in Javascript is a small price to pay to be able to convert most Python functions we are interested in, to Javascript, at runtime, and without access to the source code."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Description of the virtual machine and the intermediate language"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Our **virtual machine** consists of an **instruction set** $I = \\{\\phi_1, \\ldots, \\phi_P\\}$ of functions."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The representation we want to compile from runtime Python code consists of a set of **target Python functions** $F = \\{f_1, \\ldots, f_S\\}$. One of these functions could implement a symbol table, and return a variable from its name. These functions do not have to return a value: we assume that they implement the functionality we want (in other words, they have side effects, and those are what we are interested in)."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Context"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The virtual machine (i.e. the functions of the instruction set) is implemented in Python **and** Javascript. **Our goal is to translate all Python functions $f \\in F$ in Javascript**."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We want to compile Python functions **at runtime** to an intermediate representation (compiler **frontend**). Then, we generate Javascript code from the intermediate representation (compiler **backend**)."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Assumptions on the intermediage language"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We put a lot of restrictions on the semantics of our target functions $F$. By no means do we have to implement the full semantics of Python. The method we propose is only useful when our simplifying assumptions are satisfied."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "These assumptions concern every function $f \\in F$:\n",
      "\n",
      "1. A target function consists of a fixed and ordered sequence of operations that is entirely determined at runtime, and that does not change after initialization. That sequence is not easily obtainable statically (from source code), so we have to obtain it at runtime (initialization time).\n",
      "\n",
      "2. Every operation is of the form:\n",
      "\n",
      "$$y=\\phi(x_i, \\ldots, y_j, \\ldots)$$\n",
      "\n",
      "where $\\phi \\in I$, $x_i$ belongs to the set of arguments of $f$, and $y_j$ is a local variable defined previously in $f$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Compiler frontend"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Here, we describe the compiler frontend. The idea is the following. At runtime, we have a function $f(\\lambda_1, \\ldots, \\lambda_r) \\in F$ in Python (the arguments belong to the set of symbolic variables $V$). This function may involve many Python modules and functions, and it may use many dynamic features. However, we assume that we have control on the operations effectively performed on the inputs $\\lambda_1, \\ldots, \\lambda_r$. If this sequence is determined for good at initialization time, then we can try to deconstruct it dynamically."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "What is the sequence of operations performed on the arguments $\\lambda_1, \\ldots, \\lambda_r$?"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Theoretically, the goal is to call $f$ with all possible set of inputs (which is finite but huge), and save the results. Or we could infer the operation. More realistically, what we propose here is to call $f$ with custom Python objects as arguments. These objects reimplement many arithmetic operations so that we can track the sequence of operations performed on them."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This means that we cannot possibly do anything we want with these arguments. **Our major assumption is that any operation performed directly or indirectly on the arguments belongs to our instruction set $I$**."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "In effect, this operation is equivalent to decompile the operations of a stack machine into a syntax tree. However, we do not need to explicitly reconstruct the syntax tree. We can translate the stack machine operations in a language-independent representation, and compile it again in the target language."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Dynamic decompilation"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We propose here an implementation to recover the sequence of operations executed on a function's arguments by a compiled Python function available at runtime."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* We have a global list $L$ of operations that we construct dynamically. The initialization consists in calling a function $f$ with a set of special objects as arguments."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* We hijack the instructions of the instruction set to record, in order, every instruction executed. We record the identifier $\\phi$ of the instruction, and the identifiers $(\\alpha_{\\textrm{in}_1}, \\ldots, \\alpha_{\\textrm{in}_t})$ of the input variables. Then, we create a new GUID $\\beta_{\\textrm{out}}$ for the output variable returned by that instruction. We append $(\\beta_{\\textrm{out}}, \\phi, (\\alpha_{\\textrm{in}_1}, \\ldots, \\alpha_{\\textrm{in}_t}))$ to $L$. Then, we make the operation return the newly created variable $\\beta_{\\textrm{out}}$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* If the output of a previous operation is used in a subsequent operation, then we will know it since this output has a new globally unique identifier. In other words, an output of operation $n$ can become an input of operation $n+k$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* At the end, we have a list $L$ of operations. Every operation calls a function of the instruction set $I$, on one or several symbolic variables, and assigns the result to a new variable. The input variables are either elements of our initial set of variables $V$, or have been created by a previous operation of the list."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* Some operations of the list involve an effector method applied on symbolic variables. These operations do not return anything. These effector methods represent what the machine can do ultimately."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Compiler backend"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Finally, from our reconstructed list $L$, it is easy to generate dynamically code in a dynamic language such as Javascript. Specifically, we translate every item $(\\beta_{\\textrm{out}}, \\phi, (\\alpha_{\\textrm{in}_1}, \\ldots, \\alpha_{\\textrm{in}_t}))$ of the list into:"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "```javascript\n",
      "var beta_out = phi(alpha_in1, ..., alpha_int);\n",
      "```"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Since we initially assumed that all functions of our instruction set $I$ and the effector methods as well are implemented in Javascript, we can translate all our functions $f \\in F$ in Javascript."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Implementation\n",
      "\n",
      "We plan to create an implementation of this method, that could be reused in applications as is."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* To initialize the machine, the user provides the names of all instructions in $I$ (as `fun` or `module.fun` or `module.*`)."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* Then, the user provides a compiled function $f$ (as a Python object `f` of type `function`), and the machine returns an abstract representation of this function (specifically, the list $L$). We will monkey-patch the modules in $I$ dynamically, knowing the list of symbols to patch."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* The user can call a Javascript translator on this list $L$ to generate Javascript code dynamically."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "* In addition, it is up to the user to provide a reimplementation in Javascript of all instructions in $I$. The generated code will assume that these Javascript functions exist."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Test-Driven Development is probably a good way to implement this machine. We should start with simple examples."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Test 1"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Global variable keeping track of all operations.\n",
      "import pprint\n",
      "L = []\n",
      "def addop(fun, args, module='__builtin__'):\n",
      "    newvar = Symbol()\n",
      "    L.append((newvar, module, fun, args))\n",
      "    return newvar\n",
      "def resetop():\n",
      "    L[:] = []\n",
      "def showop():\n",
      "    pprint.pprint(L)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# Function to create a random symbol name.\n",
      "import uuid\n",
      "def newguid():\n",
      "    return uuid.uuid4().get_hex()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 2
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class Symbol(object):\n",
      "    \"\"\"A Symbol instance represents a symbolic variable. Any argument\n",
      "    of a function f \\in F will be replaced by a symbol with the\n",
      "    appropriate name.\n",
      "    \n",
      "    Many restrictions apply. For example,\n",
      "    one can do x+3 but not 3+x where x is an argument of the function.\n",
      "    The reason is that the Symbol instance implements many\n",
      "    operations such as __add__, __lt__, etc.\n",
      "    \n",
      "    \"\"\"\n",
      "    def __init__(self, name=None):\n",
      "        if name is None:\n",
      "            name = newguid()\n",
      "            self.generated = True\n",
      "        else:\n",
      "            self.generated = False\n",
      "        self.name = name\n",
      "        \n",
      "    def __add__(self, b):\n",
      "        return addop('__add__', (self, b))\n",
      "    \n",
      "    def __mul__(self, b):\n",
      "        return addop('__mul__', (self, b))\n",
      "    \n",
      "    def __repr__(self):\n",
      "        return '<Symbol {name}>'.format(name=self.name)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 3
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class MonkeySet(object):\n",
      "    \"\"\"This module replaces any module in the instruction set. When \n",
      "    a method of this module is called, this operation is recorded.\n",
      "    This class monkey patches all specified modules.\"\"\"\n",
      "    def __init__(self, module):\n",
      "        self.module = module\n",
      "        \n",
      "    def __getattr__(self, fun):\n",
      "        def myfun(*args):\n",
      "            return addop(fun, args, \n",
      "                  module=self.module)\n",
      "        return myfun"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 4
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's define the instruction set."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# These functions have to be redefined manually in Javascript.\n",
      "I = {'math': ['exp'], 'effectors': ['fun']}"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's define the function to translate."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import math\n",
      "class effectors(object):\n",
      "    @staticmethod\n",
      "    def fun(z):\n",
      "        print(\"Action is performed on {0:s}\".format(str(z)))\n",
      "def f(x):\n",
      "    \"\"\"We want to translate this function to Javascript\n",
      "    dynamically at runtime, without access to the source code.\"\"\"\n",
      "    effectors.fun(math.exp(x) + x)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's test this function."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "f(1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "Action is performed on 3.71828182846\n"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now, we are going to translate this function in Javascript."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "# We monkey-patch the instruction set.\n",
      "instruction_set = {module: MonkeySet(module) for module in I.keys()}\n",
      "for module, monkey in instruction_set.iteritems():\n",
      "    globals()[module] = monkey\n",
      "# We create one Symbol per function argument.\n",
      "x = Symbol('x')\n",
      "# We reinitialize the list of operations.\n",
      "resetop()\n",
      "# We call our function with a Symbol instance instead\n",
      "# of a normal argument, and we record at runtime\n",
      "# the list of operations performed.\n",
      "f(x)\n",
      "# We show the list of operations.\n",
      "showop()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "[(<Symbol 63ed9b2080dd433c801f147588bed087>, 'math', 'exp', (<Symbol x>,)),\n",
        " (<Symbol c91313e0e374496e8cc5f1c5f5965b45>,\n",
        "  '__builtin__',\n",
        "  '__add__',\n",
        "  (<Symbol 63ed9b2080dd433c801f147588bed087>, <Symbol x>)),\n",
        " (<Symbol 88df7aff344d417bbb255034db233477>,\n",
        "  'effectors',\n",
        "  'fun',\n",
        "  (<Symbol c91313e0e374496e8cc5f1c5f5965b45>,))]\n"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now, we can generate the Javascript code."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def getvar(arg):\n",
      "    \"\"\"Get the Javascript variable name of a Symbol.\"\"\"\n",
      "    if isinstance(arg, Symbol):\n",
      "        if arg.generated:\n",
      "            return \"var_\" + arg.name\n",
      "        else:\n",
      "            return arg.name\n",
      "    else:\n",
      "        return str(arg)\n",
      "    \n",
      "def genline(line):\n",
      "    \"\"\"Generate a line in Javascript that implements the operation\n",
      "    specified by a line in the operator list.\"\"\"\n",
      "    (newvar, module, fun, args) = line\n",
      "    if module == '__builtin__':\n",
      "        if fun == '__add__':\n",
      "            expr = \"{0:s} + {1:s}\".format(getvar(args[0]), getvar(args[1]))\n",
      "    else:\n",
      "        expr = \"{module}.{fun}({sargs})\".format(\n",
      "            module=module,\n",
      "            fun=fun,\n",
      "            sargs=', '.join([getvar(arg) for arg in args])\n",
      "            )\n",
      "    return \"var {newvar} = {expr};\".format(\n",
      "        newvar=getvar(newvar),\n",
      "        expr=expr,)\n",
      "    \n",
      "for line in L:\n",
      "    print genline(line)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "var var_63ed9b2080dd433c801f147588bed087 = math.exp(x);\n",
        "var var_c91313e0e374496e8cc5f1c5f5965b45 = var_63ed9b2080dd433c801f147588bed087 + x;\n",
        "var var_88df7aff344d417bbb255034db233477 = effectors.fun(var_c91313e0e374496e8cc5f1c5f5965b45);\n"
       ]
      }
     ],
     "prompt_number": 10
    }
   ],
   "metadata": {}
  }
 ]
}